Human action recognition based on multi-scale feature maps from depth video sequences
Abstract
Human action recognition is an active research area in computer vision. Although great progress has been made, previous methods mostly recognize actions from depth video sequences at only one scale, and thus often neglect multi-scale spatial changes that provide additional information in practical applications. In this paper, we present a novel framework with a mechanism to improve the scale diversity of motion features. We propose a feature map called Laplacian pyramid depth motion images (LP-DMI). First, we employ depth motion images (DMI) as the templates to generate a static representation of actions. Then, we calculate LP-DMI to enhance dynamic motions and reduce redundancy from human bodies. We further extract a multi-granularity descriptor, LP-DMI-HOG, to be more discriminative. Finally, we utilize an extreme learning machine (ELM) for classification. The proposed method yields accuracies of 93.41%, 85.12%, and 91.94% on the public MSRAction3D, UTD-MHAD, and DHA datasets, respectively. Through extensive experiments, we show that our method outperforms state-of-the-art benchmarks.
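As a rough illustration of the Laplacian pyramid idea behind LP-DMI, the sketch below decomposes a single 2-D map into band-pass layers at several scales. It is a minimal sketch under stated assumptions, not the authors' implementation: it replaces the usual Gaussian filtering with 2x2 block averaging and nearest-neighbour upsampling, it takes a random array in place of a real depth motion image, and the function names (`downsample`, `upsample`, `laplacian_pyramid`, `reconstruct`) are hypothetical.

```python
import numpy as np


def downsample(img):
    """Halve resolution by averaging 2x2 blocks.

    Assumption: block averaging stands in for Gaussian blur + subsampling.
    An odd trailing row or column is cropped before averaging.
    """
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(img, shape):
    """Nearest-neighbour upsample by 2, then pad (edge replication) to `shape`."""
    up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
    pad_h = max(shape[0] - up.shape[0], 0)
    pad_w = max(shape[1] - up.shape[1], 0)
    up = np.pad(up, ((0, pad_h), (0, pad_w)), mode="edge")
    return up[:shape[0], :shape[1]]


def laplacian_pyramid(img, levels=3):
    """Return `levels` band-pass layers followed by the coarsest low-pass layer."""
    layers = []
    current = img.astype(np.float64)
    for _ in range(levels):
        smaller = downsample(current)
        # Each band keeps the detail lost when moving to the coarser scale.
        layers.append(current - upsample(smaller, current.shape))
        current = smaller
    layers.append(current)
    return layers


def reconstruct(layers):
    """Invert the pyramid by adding each band back onto the upsampled coarser layer."""
    current = layers[-1]
    for band in reversed(layers[:-1]):
        current = band + upsample(current, band.shape)
    return current


# Stand-in for a depth motion image; a real DMI would come from a depth sequence.
demo_map = np.random.default_rng(0).random((64, 48))
demo_layers = laplacian_pyramid(demo_map, levels=3)
print([layer.shape for layer in demo_layers])
```

Each band-pass layer holds the detail present at one scale but absent at the next coarser one, which is why such a decomposition emphasises structure at several scales. In a pipeline like the one described above, a descriptor such as HOG would then be computed per layer before classification; that step is omitted here.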
Similar resources
Multi-Scale Locality-Constrained Spatiotemporal Coding for Local Feature Based Human Action Recognition
We propose a Multiscale Locality-Constrained Spatiotemporal Coding (MLSC) method to improve the traditional bag of features (BoF) algorithm, which ignores the spatiotemporal relationship of local features for human action recognition in video. To model this spatiotemporal relationship, MLSC incorporates the spatiotemporal position of local features into the feature coding process. It projects local fe...
Multi-Features Encoding and Selecting Based on Genetic Algorithm for Human Action Recognition from Video
In this study, we propose encoding multiple local features to recognize human actions. The local features are obtained from simple feature descriptions of human actions in video. Two important kinds of simple features, optical flow and edges, represent human perception of video behavior. As the video information descriptors, optical flow and edge,...
Video Matting from Depth Maps
Image matting is the process of computing an alpha value for each pixel that corresponds to the amount of the pixel that is foreground and the amount that is background. This is often used to isolate the foreground and replace the background with a different image. Typically this is done using a special studio and a blue (or green) screen for easier segmentation. However, this method is not as ...
Video Matting from Depth Maps
Image matting is the process of taking an image, isolating the foreground, and replacing the background with a new image. This can be a hard problem when the background is unknown as it is fundamentally unconstrained. We look at an existing technique for foreground-background separation called Bayesian matting and improve upon it by adding depth information acquired by a time-of-flight range sc...
Action Recognition using Temporal Bag-of-Words from Depth Maps
In this paper, we present a methodology for human action recognition from a sequence of depth maps obtained using Microsoft Kinect. Specifically, we use a Temporal Bag-of-Words model as representation scheme to capture the variation of features across the temporal domain. Our methodology builds the Temporal Bag-of-Words model on top of the spatiotemporal features extracted from interest points....
Journal
Journal title: Multimedia Tools and Applications
Year: 2021
ISSN: 1380-7501, 1573-7721
DOI: https://doi.org/10.1007/s11042-021-11193-4